Abstract

In this paper, we propose an algorithm that finds the Nash equilibrium solution of an n-person noncooperative dynamic game by the extremum seeking control approach with sliding mode. For each player, a switching function is defined as the difference between the player's cost function and a reference signal. The extremum seeking controller for each player is designed so that the system converges to a sliding boundary layer defined in the vicinity of the sliding mode corresponding to the switching function. Inside the boundary layer, the cost function tracks the reference signal and converges to the Nash equilibrium solution.

Keywords: Noncooperative Game, Nash Equilibrium Solution, Extremum Seeking, Sliding Mode


1  Introduction 

In an n-person noncooperative dynamic game, each player defines a cost function and adjusts some of the control parameters to minimize his own cost function [1][2], so that the players together seek a Nash equilibrium solution. When the cost function is a measurable variable, or a combination of measurable variables, but cannot be formulated exactly, i.e. when its mathematical form is not given although its value can be measured, extremum seeking control with sliding mode can be used to solve for the Nash solution.

Extremum seeking control approaches have been proposed to find a setpoint and/or track a varying setpoint so that a cost function of the system (which may be unknown) reaches its extremum [3][4][5][6]. Extremum seeking controllers with sliding mode have been proposed in [7][8][9][10] and can be explained by the configuration in Figure 1. The switching function $s(t)$ is defined as

$$s(t) = y(t) - g(t)$$

where $g(t)$ is a reference signal. The setpoint for the minimum (or maximum) can be reached no matter how the plant changes. With this control method the sliding mode $s(t) = 0$ occurs: the system oscillates in the vicinity of the sliding mode $s(t) = 0$, i.e. inside a sliding boundary layer $|s(t)| < \epsilon$, and a minimum or maximum point can be reached in the sliding mode, as shown in Figure 1. Designing an extremum seeking controller for each player ensures that the dynamic game system converges to the Nash equilibrium point.

Figure 1: Extremum Seeking Control Using Sliding Mode

The  arrangement  of  the  paper  is  as  follows.  Section  2 
describes  the  problem  formulation;  Section  3  proposes 
the  extremum  seeking  algorithm  using  sliding  mode  to 
calculate  the  Nash  equilibrium  solution;  and  Section  4 
gives  simulation  results. 


2  Problem  Formulation

Consider an n-person noncooperative dynamic game described by a nonlinear system

$$\frac{d}{dt}x(t) = f(x(t), u_1(t), u_2(t), \cdots, u_n(t)) \qquad (1)$$

with the cost function for the i-th player

$$J_i(t) = J_i(x(t)), \quad (i \in N) \qquad (2)$$

where $N$ is the index set of the players, defined as

$$N = \{1, 2, \cdots, n\},$$

and $x(t) \in R^m$, $u_i(t) \in R$ $(i \in N)$, and $J_i(t) \in R$ $(i \in N)$ are the state variable, the i-th player's control input, and the i-th player's cost function, respectively. The functions $f(x(t), u_1(t), u_2(t), \cdots, u_n(t))$ and $J_i(x)$ $(i \in N)$ are assumed to be smooth.

Assumption 1 There exist smooth control laws

$$u_i(t) = a_i(x(t), \theta_i), \quad (i \in N) \qquad (3)$$

for all players that stabilize the nonlinear system (1), where $\theta_i \in \Theta_i$ $(i \in N)$ is a control parameter.

With the control input (3), the closed-loop system of the nonlinear system (1) is determined by

$$\frac{d}{dt}x(t) = f(x(t), a_1(x(t), \theta_1), a_2(x(t), \theta_2), \cdots, a_n(x(t), \theta_n))$$


3  Extremum  Seeking  with  Sliding  Mode 

To design an extremum seeking controller with sliding mode for the i-th player $(i \in N)$, a switching function is defined as

$$s_i(t) = J_i(t) - g_i(t) \qquad (4)$$

where the reference signal $g_i(t) \in R$ is determined by

$$\dot{g}_i(t) = \rho_i(t), \qquad (5)$$

where the time-varying parameter $\rho_i(t)$ will be given later. Then a sliding boundary layer based on the above switching function is defined as

$$|s_i(t)| < \epsilon_i \qquad (6)$$

where $\epsilon_i > 0$ is a small positive constant.


Assumption 2 There exists a smooth function $x_e: R^n \to R^m$ such that

$$f(x, a_1(x, \theta_1), a_2(x, \theta_2), \cdots, a_n(x, \theta_n)) = 0 \;\Longleftrightarrow\; x = x_e(\theta_1, \theta_2, \cdots, \theta_n),$$

i.e., every n-tuple of the control parameters $\theta_i \in \Theta_i$ $(i \in N)$ determines a unique equilibrium point.

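Assumption 3 The static performance map at the equilibrium point $x_e(\theta_1, \theta_2, \cdots, \theta_n)$ from an n-tuple of $\theta_i \in \Theta_i$ $(i \in N)$ to $J_i(t)$, represented by

$$J_i^s = J_i(x_e(\theta_1, \theta_2, \cdots, \theta_n)) = J_i(\theta_1, \theta_2, \cdots, \theta_n), \quad (i \in N)$$

is smooth and has a unique Nash equilibrium solution at the point $(\theta_1^*, \theta_2^*, \cdots, \theta_n^*)$ such that

$$J_i^s(\theta_1^*, \theta_2^*, \cdots, \theta_i^*, \cdots, \theta_n^*) \le J_i^s(\theta_1^*, \theta_2^*, \cdots, \theta_i, \cdots, \theta_n^*), \quad \forall \theta_i \in \Theta_i, \; (i \in N)$$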

Assumption 4 The partial derivative of the static performance map $J_i^s$ $(i \in N)$ satisfies

v^<  (i&N) 

The control objective is to reach the Nash equilibrium solution $J_i^*(\theta_1^*, \theta_2^*, \cdots, \theta_n^*)$ by having each player $(i \in N)$ adjust his own parameter $\theta_i$ separately.


Let the variable structure control law be

$$v_i(t) = -k_i\,\mathrm{sgn}(s_i(t)) \qquad (7)$$

and let the parameter $\theta_i$ satisfy

$$\dot{\theta}_i(t) = v_i(t)$$

where $k_i$ is a small enough positive constant.

Assumption 5 The dynamic system given in (1) is much faster than the adjustment process of the parameter $\theta_i$, i.e.

$$\left|\frac{d}{dt}x(t)\right| \gg \left|\frac{d}{dt}\theta_i(t)\right|.$$

Therefore, in the design of the extremum seeking controller, the cost function $J_i(t)$ can be replaced by the static performance map

$$J_i^s = J_i(\theta_1, \theta_2, \cdots, \theta_n).$$

Assumption 5 is reasonable once $k_i$ is small enough.

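Assumption 6 The setpoint $(\theta_1^*, \theta_2^*, \cdots, \theta_n^*)$ corresponding to the Nash equilibrium solution is in the vicinity of the initial n-tuple of $\theta_i(0)$ $(i \in N)$. Thus the partial derivative of the cost function $J_i(t)$ with respect to $\theta_i$ is bounded by a positive constant $\gamma_i$, i.e.,

$$\left|\frac{\partial}{\partial\theta_i} J_i(\theta_1, \theta_2, \cdots, \theta_i, \cdots, \theta_n)\right| < \gamma_i \qquad (8)$$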

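Based on the above assumptions, the derivative of the switching function $s_i(t)$ is given by

$$\frac{d}{dt}s_i(t) = \sum_{j=1}^{n} \frac{\partial}{\partial\theta_j} J_i(\theta_1, \theta_2, \cdots, \theta_n)\,\dot{\theta}_j(t) - \dot{g}_i(t) = -W_i(\theta_1, \theta_2, \cdots, \theta_n)\,k_i\,\mathrm{sgn}(s_i(t)) - \rho_i(t)$$

where $W_i(\theta_1, \theta_2, \cdots, \theta_n)$ is determined by

$$W_i(\theta_1, \theta_2, \cdots, \theta_n) = \frac{\displaystyle\sum_{j=1}^{n} \frac{\partial}{\partial\theta_j} J_i(\theta_1, \theta_2, \cdots, \theta_n)\,k_j\,\mathrm{sgn}(s_j(t))}{k_i\,\mathrm{sgn}(s_i(t))}$$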


According to Assumptions 4 and 6, and provided that the positive constants $k_i$ $(i \in N)$ are bounded by some constant, it can be shown that $W_i(\theta_1, \theta_2, \cdots, \theta_n)$ is bounded. To simplify the notation, it is assumed to be bounded by $\gamma_i$ $(i \in N)$, i.e.

$$|W_i(\theta_1, \theta_2, \cdots, \theta_n)| < \gamma_i, \quad (i \in N)$$

Define a Lyapunov function as

$$V_i(t) = \frac{1}{2}s_i^2(t) \qquad (9)$$

Then

$$\frac{d}{dt}V_i(t) = -W_i(\theta_1, \theta_2, \cdots, \theta_n)\,k_i\,|s_i(t)| - s_i(t)\,\rho_i(t)$$


According to sliding mode control theory, to ensure convergence to the sliding mode $s_i(t) = 0$ or to the sliding boundary layer $|s_i(t)| < \epsilon_i$, the above derivative must be negative. Therefore the time-varying parameter $\rho_i(t)$ outside the sliding boundary layer (6) is chosen as

$$\rho_i(t) = \begin{cases} -\underline{\rho}_i, & s_i(t) < -\epsilon_i \\ \bar{\rho}_i, & s_i(t) > \epsilon_i \end{cases} \qquad (10)$$

where $\underline{\rho}_i$ and $\bar{\rho}_i$ are positive constants satisfying

$$\underline{\rho}_i > \gamma_i k_i + \sigma_i, \qquad \bar{\rho}_i > \gamma_i k_i + \sigma_i,$$

and $\sigma_i$ is a positive constant. Thus the following holds:

$$\frac{d}{dt}V_i(t) < -\sigma_i\,|s_i(t)|, \quad |s_i(t)| > \epsilon_i \qquad (11)$$

which means that the system will enter the sliding boundary layer $|s_i(t)| < \epsilon_i$ in a finite time and stay there thereafter.
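The step from (10) to (11) can be written out case by case. The following is a sketch only: it writes $\bar{\rho}_i$ and $\underline{\rho}_i$ for the two positive constants in (10), takes $V_i = s_i^2/2$ as the Lyapunov function, and uses the bound $|W_i| < \gamma_i$. The reaching-time estimate in the last line is a standard consequence of (11) and is not stated in the original text.

```latex
% Above the layer: s_i > \epsilon_i, \rho_i(t) = \bar{\rho}_i, s_i = |s_i|
\frac{d}{dt}V_i = -W_i k_i |s_i| - \bar{\rho}_i |s_i|
                \le (\gamma_i k_i - \bar{\rho}_i)\,|s_i|
                < -\sigma_i |s_i|

% Below the layer: s_i < -\epsilon_i, \rho_i(t) = -\underline{\rho}_i, s_i = -|s_i|
\frac{d}{dt}V_i = -W_i k_i |s_i| - \underline{\rho}_i |s_i|
                \le (\gamma_i k_i - \underline{\rho}_i)\,|s_i|
                < -\sigma_i |s_i|

% Because dV_i/dt = |s_i| \, d|s_i|/dt, both cases give d|s_i|/dt < -\sigma_i,
% so the layer is reached no later than
t_{\mathrm{reach}} \le \frac{|s_i(0)| - \epsilon_i}{\sigma_i}
```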

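Inside the sliding boundary layer $|s_i(t)| < \epsilon_i$, the time-varying parameter $\rho_i(t)$ is chosen as

$$\rho_i(t) = \begin{cases} -\rho_i, & |s_i(t)| < \epsilon_i \\ 2\epsilon_i\,\delta(t - t_0), & s_i(t) = \epsilon_i \end{cases}$$

where $\rho_i$ is a positive constant satisfying

$$\rho_i > \gamma_i k_i + \sigma_i,$$

and $\delta(t - t_0)$ is the impulse function defined by

$$\int_{t_0}^{t_0^+} \delta(t - t_0)\,dt = 1 \qquad (12)$$

with $t_0$ the time instant at which $s_i(t)$ reaches the boundary $\epsilon_i$.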


With the parameter $\rho_i(t)$ designed above, inside the sliding boundary layer $|s_i(t)| < \epsilon_i$, except on the boundary $s_i(t) = \epsilon_i$, the reference signal $g_i(t)$ keeps decreasing with

$$\dot{g}_i(t) = -\rho_i.$$

At the same time, the cost function $J_i(t)$ may increase or decrease, but the absolute value of its rate of change is less than that of $g_i(t)$:

$$\left|\frac{d}{dt}J_i(t)\right| < \gamma_i k_i < \rho_i - \sigma_i < \rho_i = |\dot{g}_i(t)|$$

Therefore $s_i(t) = J_i(t) - g_i(t)$ will increase, i.e., the system will move toward the sliding boundary $s_i(t) = \epsilon_i$ and reach it at some time instant $t_0$. Then, with the adjusting rule $\dot{g}_i(t) = 2\epsilon_i\,\delta(t - t_0)$, the system will move to the other boundary $s_i(t) = -\epsilon_i$, since


$$s_i(t_0) = J_i(t_0) - g_i(t_0) = \epsilon_i$$

$$s_i(t_0^+) = J_i(t_0^+) - g_i(t_0^+) = (J_i(t_0) - g_i(t_0)) - 2\epsilon_i \int_{t_0}^{t_0^+} \delta(t - t_0)\,dt = -\epsilon_i$$


After that, the system moves from the boundary $s_i(t) = -\epsilon_i$ back to the boundary $s_i(t) = \epsilon_i$ while the reference signal $g_i(t)$ keeps decreasing. In this way, the system oscillates inside the sliding boundary layer.


It is assumed that the system reaches the boundary $s_i(t) = \epsilon_i$ at the time instants $t = t_0, t_1, t_2, \cdots$, as shown in Figure 2. If the sliding boundary layer is chosen to be narrow enough, then it is reasonable to assume that every period $[t_j, t_{j+1})$ $(j = 0, 1, 2, \cdots)$ is very short, so that the function $W_i(\theta_1(t), \theta_2(t), \cdots, \theta_n(t))$, denoted as

$$\alpha_{i,j} = W_i(\theta_1(t), \theta_2(t), \cdots, \theta_n(t))\big|_{t \in [t_j, t_{j+1})}, \quad (j = 0, 1, 2, \cdots)$$

is a constant in each period and satisfies

$$|\alpha_{i,j}| < \gamma_i, \quad (j = 0, 1, 2, \cdots)$$

Figure 2: Extremum Seeking Control with Sliding Mode


We now show that the cost function $J_i(t)$ decreases while the system oscillates inside the sliding boundary layer.

At the time instant $t_0$,

$$s_i(t_0) = J_i(t_0) - g_i(t_0) = \epsilon_i.$$

Then at the time instant $t_0^+$,

$$J_i(t_0^+) = J_i(t_0),$$

$$g_i(t_0^+) = g_i(t_0) + 2\epsilon_i,$$

$$s_i(t_0^+) = -\epsilon_i.$$

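Let $t_{01}$ denote the time instant when $s_i(t)|_{t = t_{01}} = 0$ $(t_0 < t_{01} < t_1)$. For $t_0^+ < t < t_{01}$, the following hold:

$$s_i(t) = J_i(t) - g_i(t) < 0,$$

$$\dot{J}_i(t) = -\alpha_{i,0} k_i\,\mathrm{sgn}(s_i(t)) = \alpha_{i,0} k_i,$$

$$\dot{g}_i(t) = -\rho_i < 0$$

which yield

$$J_i(t) = J_i(t_0) + \alpha_{i,0} k_i (t - t_0)$$

$$g_i(t) = g_i(t_0) + 2\epsilon_i - \rho_i (t - t_0).$$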

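According to

$$s_i(t_{01}) = J_i(t_{01}) - g_i(t_{01}) = 0,$$

the time instant $t_{01}$ can be found as

$$t_{01} = t_0 + \frac{\epsilon_i}{\rho_i + \alpha_{i,0} k_i}$$

and $J_i(t)$ and $g_i(t)$ at $t = t_{01}$ are given by

$$J_i(t_{01}) = J_i(t_0) + \frac{\alpha_{i,0} k_i\,\epsilon_i}{\rho_i + \alpha_{i,0} k_i}$$

$$g_i(t_{01}) = g_i(t_0) + 2\epsilon_i - \frac{\rho_i\,\epsilon_i}{\rho_i + \alpha_{i,0} k_i}$$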

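For $t_{01} < t < t_1$, the following hold:

$$s_i(t) = J_i(t) - g_i(t) > 0,$$

$$\dot{J}_i(t) = -\alpha_{i,0} k_i,$$

$$\dot{g}_i(t) = -\rho_i < 0$$

which yield

$$J_i(t) = J_i(t_{01}) - \alpha_{i,0} k_i (t - t_{01})$$

$$g_i(t) = g_i(t_{01}) - \rho_i (t - t_{01})$$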


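According to

$$s_i(t_1) = J_i(t_1) - g_i(t_1) = \epsilon_i,$$

the time instant $t_1$ can be found as

$$t_1 = t_{01} + \frac{\epsilon_i}{\rho_i - \alpha_{i,0} k_i},$$

and $J_i(t)$ and $g_i(t)$ at $t = t_1$ are determined by

$$J_i(t_1) = J_i(t_0) - 2\epsilon_i \frac{\alpha_{i,0}^2 k_i^2}{\rho_i^2 - \alpha_{i,0}^2 k_i^2}$$

$$g_i(t_1) = g_i(t_0) - 2\epsilon_i \frac{\alpha_{i,0}^2 k_i^2}{\rho_i^2 - \alpha_{i,0}^2 k_i^2}$$

i.e., the cost function $J_i(t)$ and the reference signal $g_i(t)$ decrease over the period $[t_0, t_1)$ as

$$J_i(t_1) - J_i(t_0) = -2\epsilon_i \frac{\alpha_{i,0}^2 k_i^2}{\rho_i^2 - \alpha_{i,0}^2 k_i^2} < 0$$

$$g_i(t_1) - g_i(t_0) = -2\epsilon_i \frac{\alpha_{i,0}^2 k_i^2}{\rho_i^2 - \alpha_{i,0}^2 k_i^2} < 0.$$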
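The closed-form decrease over one period can be cross-checked numerically. The sketch below steps through the two phases of a period (from $s_i = -\epsilon_i$ to $0$, then from $0$ to $\epsilon_i$) and compares the accumulated change with the formula. The function names and the sample value $\alpha_{i,0} = 0.3$ are illustrative assumptions; the values of $\epsilon_i$, $\rho_i$ and $k_i$ are borrowed from Section 4.

```python
def period_decrease(eps, rho, k, alpha):
    """Walk through one period [t0, t1) of the motion inside the layer.

    Returns (dt_up, dt_down, dJ, dg): the durations of the two phases and
    the net change of the cost J_i and of the reference g_i over the period.
    """
    if not abs(alpha) * k < rho:
        raise ValueError("the derivation requires |alpha| * k < rho")
    # Phase 1: s rises from -eps to 0; J moves at +alpha*k, g at -rho.
    dt_up = eps / (rho + alpha * k)
    # Phase 2: s rises from 0 to +eps; J moves at -alpha*k, g at -rho.
    dt_down = eps / (rho - alpha * k)
    dJ = alpha * k * dt_up - alpha * k * dt_down
    # g jumps by +2*eps at t0 and then ramps down at rate rho.
    dg = 2.0 * eps - rho * (dt_up + dt_down)
    return dt_up, dt_down, dJ, dg


def closed_form_decrease(eps, rho, k, alpha):
    """Net change of J_i (and of g_i) over one period, from the formula."""
    ak2 = (alpha * k) ** 2
    return -2.0 * eps * ak2 / (rho ** 2 - ak2)


if __name__ == "__main__":
    # alpha = 0.3 is a hypothetical value of alpha_{i,0}.
    print(period_decrease(eps=0.05, rho=0.005, k=0.01, alpha=0.3))
    print(closed_form_decrease(eps=0.05, rho=0.005, k=0.01, alpha=0.3))
```

Both routes give the same decrement for $J_i$ and $g_i$, and the decrement vanishes as $\alpha_{i,0} \to 0$, which is the mechanism by which the descent stops at the equilibrium.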


In a similar way, it can be shown that before the Nash equilibrium solution is reached, the following hold:

$$J_i(t_0) > J_i(t_1) > J_i(t_2) > \cdots$$

$$g_i(t_0) > g_i(t_1) > g_i(t_2) > \cdots$$

When the Nash solution is reached at a time instant $t_m$, i.e. when $\alpha_{i,j} = 0$ $(j = m, m+1, m+2, \cdots)$, $J_i(t_j)$ and $g_i(t_j)$ remain constant.


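Theorem 1 Consider the dynamic noncooperative game described by the state equation (1) with the control input (3). The sliding mode controller with the extremum seeking control approach for the i-th player $(i \in N)$, designed as

$$s_i(t) = J_i(t) - g_i(t)$$

$$\dot{\theta}_i(t) = v_i(t)$$

$$v_i(t) = -k_i\,\mathrm{sgn}(s_i(t))$$

$$\dot{g}_i(t) = \begin{cases} -\underline{\rho}_i, & s_i(t) < -\epsilon_i \\ -\rho_i, & |s_i(t)| < \epsilon_i \\ 2\epsilon_i\,\delta(t - t_0), & s_i(t) = \epsilon_i \\ \bar{\rho}_i, & s_i(t) > \epsilon_i \end{cases}$$

ensures that the cost functions $J_i(t)$ $(i \in N)$ are minimized to reach the Nash equilibrium solution $J_i^*(\theta_1^*, \theta_2^*, \cdots, \theta_n^*)$.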


Remark 1 The variable structure control rule for the i-th player $(i \in N)$ may be replaced by

$$v_i(t) = -k_i\,\mathrm{sgn}\!\left(\sin\!\left(\frac{\omega_i \pi\, s_i(t)}{2\epsilon_i}\right)\right), \quad (i \in N) \qquad (13)$$

where $\omega_i > 1$ is a positive number. The amplitude of the vibration of the cost function $J_i(t)$ then becomes smaller for larger $\omega_i$, and the extremum seeking control remains stable.
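A minimal sketch of the two switching laws may help to see what (13) changes. With $\omega_i = 1$ the argument of the sine stays within $(-\pi/2, \pi/2)$ inside the layer, so (13) coincides with (7) there; for $\omega_i > 1$ the sign flips every $2\epsilon_i/\omega_i$ in $s_i$, which places several switching surfaces inside the layer. The function names are illustrative assumptions.

```python
import math


def sgn(v):
    """Sign function returning -1, 0 or 1."""
    return (v > 0) - (v < 0)


def v_basic(s, k):
    """Variable structure law (7): v = -k * sgn(s)."""
    return -k * sgn(s)


def v_sine(s, k, eps, omega):
    """Periodic switching law (13): v = -k * sgn(sin(omega*pi*s / (2*eps)))."""
    return -k * sgn(math.sin(omega * math.pi * s / (2.0 * eps)))


if __name__ == "__main__":
    k, eps = 0.01, 0.05
    for s in (-0.03, -0.01, 0.01, 0.03):
        # omega = 4 is a hypothetical choice; zeros of the sine then lie
        # at multiples of 2*eps/omega = 0.025 in s.
        print(s, v_basic(s, k), v_sine(s, k, eps, 1), v_sine(s, k, eps, 4))
```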
Remark 2 To implement the proposed algorithm for a sampled-data system, and also to simplify the controller, the reference signal $g_i(t)$ $(i \in N)$ can be modified as

«*>-{"&  (u) 

where $\hat{\rho}_i$ satisfies

$$l_i\,\hat{\rho}_i\,T_i = 2\epsilon_i,$$

$T_i$ is the sampling interval, and $l_i$ is a positive constant that indicates the number of sampling intervals during which $\dot{g}_i(t) = \hat{\rho}_i$ $(i \in N)$.
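One way to read the sampled-data idea is that the impulse of Theorem 1 is replaced by a ramp of rate $\hat{\rho}_i$ lasting $l_i$ samples, so that the reference still rises by $2\epsilon_i$ in total. The sketch below implements that reading for a single player acting on a static cost. It is an assumption-laden illustration, not the authors' implementation: the quadratic test cost, the function names, the rate `rho_out` used below the layer, and the exact branch conditions are all hypothetical choices.

```python
def sgn(v):
    """Sign function returning -1, 0 or 1."""
    return (v > 0) - (v < 0)


def seek_minimum(cost, theta0, steps, T=0.01, k=0.01, rho=0.005,
                 rho_out=0.05, eps=0.05, n_jump=2):
    """Sampled-data extremum seeking with a sliding boundary layer.

    cost     -- measurable static cost J(theta) (hypothetical test function)
    rho_out  -- reference rate used when s < -eps (assumed value)
    n_jump   -- number of samples over which the 2*eps jump is spread (l_i)
    Returns a list of (theta, J, s) samples.
    """
    rho_jump = 2.0 * eps / (n_jump * T)   # so that n_jump*rho_jump*T = 2*eps
    theta = theta0
    g = cost(theta0)                      # start on the surface s = 0
    jump_left = 0
    trace = []
    for _ in range(steps):
        J = cost(theta)
        s = J - g                         # switching function
        if jump_left == 0 and s >= eps:
            jump_left = n_jump            # upper boundary hit: start the jump
        if jump_left > 0:
            g_dot = rho_jump              # spread the 2*eps jump over samples
            jump_left -= 1
        elif s < -eps:
            g_dot = -rho_out              # below the layer: pull g down faster
        else:
            g_dot = -rho                  # inside the layer: slow ramp down
        theta += -k * sgn(s) * T          # variable structure law (7)
        g += g_dot * T
        trace.append((theta, J, s))
    return trace


if __name__ == "__main__":
    # Hypothetical cost with its minimum at theta = 2.
    trace = seek_minimum(lambda th: (th - 2.0) ** 2, theta0=1.0, steps=30000)
    print(trace[-1])
```

With the defaults, `rho_jump` evaluates to 5.0, which matches the relation $l_i \hat{\rho}_i T_i = 2\epsilon_i$ for $l_i = 2$, $T_i = 0.01$ and $\epsilon_i = 0.05$.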


4  Examples 

Consider a two-person noncooperative dynamic game described by a second-order linear system with unknown parameters,

$$\dot{x}(t) = \begin{bmatrix} -1 & 0.2 \\ 0.3 & -1 \end{bmatrix} x(t) + \begin{bmatrix} 0.5\,(u_1(t) - 2 - 0.1\,u_2(t))^2 + 1.0 \\ 0.7\,(u_2(t) - 1 - 0.2\,u_1(t))^2 + 0.5 \end{bmatrix}$$


The cost functions of the two players are respectively defined as

$$J_1(t) = x_1(t)$$

$$J_2(t) = x_2(t)$$

The control input is chosen to be the control parameter, i.e.

$$u_i(t) = \theta_i(t), \quad (i = 1, 2)$$


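Both first-order conditions $\partial J_1/\partial\theta_1 = 0$ and $\partial J_2/\partial\theta_2 = 0$ of the static performance map hold when the two squared terms vanish, i.e. when $\theta_1 = 2 + 0.1\,\theta_2$ and $\theta_2 = 1 + 0.2\,\theta_1$, so the Nash equilibrium point is given by

$$\theta_1^* = 2.1429$$

$$\theta_2^* = 1.4286.$$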
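The equilibrium point can be checked directly from the static performance map of this example. The sketch below solves the plant's steady state for constant inputs, solves the two linear conditions under which both squared terms vanish, and confirms that a unilateral deviation by either player raises that player's own cost. The function names and the deviation size are illustrative assumptions.

```python
def equilibrium_state(theta1, theta2):
    """Steady state x_e of the example plant for constant u_i = theta_i."""
    b1 = 0.5 * (theta1 - 2.0 - 0.1 * theta2) ** 2 + 1.0
    b2 = 0.7 * (theta2 - 1.0 - 0.2 * theta1) ** 2 + 0.5
    # Solve 0 = A x + b with A = [[-1, 0.2], [0.3, -1]] by Cramer's rule.
    det = 1.0 - 0.2 * 0.3
    x1 = (b1 + 0.2 * b2) / det
    x2 = (0.3 * b1 + b2) / det
    return x1, x2


def nash_point():
    """Solve theta1 = 2 + 0.1*theta2 together with theta2 = 1 + 0.2*theta1."""
    theta1 = (2.0 + 0.1 * 1.0) / (1.0 - 0.1 * 0.2)
    theta2 = 1.0 + 0.2 * theta1
    return theta1, theta2


def is_nash(theta1, theta2, step=0.1):
    """Check that unilateral deviations of size `step` raise the own cost."""
    j1, j2 = equilibrium_state(theta1, theta2)
    for d in (-step, step):
        if equilibrium_state(theta1 + d, theta2)[0] <= j1:
            return False      # player 1 would gain by deviating
        if equilibrium_state(theta1, theta2 + d)[1] <= j2:
            return False      # player 2 would gain by deviating
    return True


if __name__ == "__main__":
    t1, t2 = nash_point()
    print(round(t1, 4), round(t2, 4), is_nash(t1, t2))
```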

The proposed extremum seeking control algorithm is implemented for the above system with the sampling interval $T = 0.01$ second and the other parameters set as

$$\rho_i = 0.005, \quad \hat{\rho}_i = 5.0, \quad l_i = 2, \quad \epsilon_i = 0.05, \quad k_i = 0.01, \quad (i = 1, 2)$$
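Assumption 5 (plant much faster than the parameter adjustment) can be sanity-checked for these values. The sketch below computes the time constants of the linear part of the plant from its eigenvalues and compares them with how far a parameter can move in that time at the rate $k_i$. The function name is an illustrative assumption.

```python
import math


def plant_time_constants():
    """Time constants of A = [[-1, 0.2], [0.3, -1]].

    The characteristic polynomial is (lam + 1)**2 - 0.2*0.3 = 0, so the
    eigenvalues are -1 +/- sqrt(0.06); both are real and negative.
    """
    root = math.sqrt(0.2 * 0.3)
    eigenvalues = (-1.0 + root, -1.0 - root)
    return [-1.0 / lam for lam in eigenvalues]


if __name__ == "__main__":
    k = 0.01
    slowest = max(plant_time_constants())
    # Distance a parameter travels during the slowest plant time constant.
    print(slowest, k * slowest)
```

The slowest plant mode settles on a time scale of roughly 1.3 seconds, during which $\theta_i$ moves by about 0.013, so the parameters change slowly compared with the plant state.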


The simulation results are given in Figure 3, which shows that the system enters the sliding boundary layer in a finite time and then oscillates inside the layer while the cost function keeps decreasing, with oscillation, until the Nash equilibrium point is reached. The amplitude of the vibration can be reduced by choosing a narrower boundary layer $\epsilon_i$. Figure 4 shows simulation results with

$$\epsilon_i = 0.01, \quad \hat{\rho}_i = 1, \quad (i = 1, 2)$$

Using the control law given in Remark 1 with larger constants $\omega_i$ $(i = 1, 2)$ results in higher control accuracy, as shown in Figure 5.
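A simulation of this kind can be sketched in a few lines. The code below is not the authors' program: it is a forward-Euler sketch of the two-player example in which each player runs the same sampled-data controller, with the impulse spread over `n_jump` samples as in Remark 2. The initial parameters `theta0`, the horizon `t_end`, the rate `rho_out` used below the layer and the function names are assumptions, since the text does not give them.

```python
def sgn(v):
    """Sign function returning -1, 0 or 1."""
    return (v > 0) - (v < 0)


def forcing(theta):
    """Input-dependent forcing term of the example plant (u_i = theta_i)."""
    b1 = 0.5 * (theta[0] - 2.0 - 0.1 * theta[1]) ** 2 + 1.0
    b2 = 0.7 * (theta[1] - 1.0 - 0.2 * theta[0]) ** 2 + 0.5
    return b1, b2


def simulate_game(theta0=(1.0, 1.0), t_end=600.0, T=0.01, k=0.01,
                  rho=0.005, rho_out=0.05, eps=0.05, n_jump=2):
    """Return a list of (theta1, theta2, x1, x2) samples."""
    rho_jump = 2.0 * eps / (n_jump * T)   # n_jump * rho_jump * T = 2*eps
    theta = list(theta0)
    b1, b2 = forcing(theta)
    # Start the plant at the steady state that belongs to theta0.
    x = [(b1 + 0.2 * b2) / 0.94, (0.3 * b1 + b2) / 0.94]
    g = list(x)                           # references start at s_i = 0
    jump_left = [0, 0]
    history = []
    for _ in range(int(round(t_end / T))):
        for i in range(2):
            s = x[i] - g[i]               # J_i = x_i is the measured cost
            if jump_left[i] == 0 and s >= eps:
                jump_left[i] = n_jump     # upper boundary hit: start the jump
            if jump_left[i] > 0:
                g_dot = rho_jump
                jump_left[i] -= 1
            elif s < -eps:
                g_dot = -rho_out          # assumed rate below the layer
            else:
                g_dot = -rho
            theta[i] += -k * sgn(s) * T   # each player adjusts only theta_i
            g[i] += g_dot * T
        # Forward-Euler step of the plant with the updated inputs.
        b1, b2 = forcing(theta)
        dx1 = -x[0] + 0.2 * x[1] + b1
        dx2 = 0.3 * x[0] - x[1] + b2
        x = [x[0] + T * dx1, x[1] + T * dx2]
        history.append((theta[0], theta[1], x[0], x[1]))
    return history


if __name__ == "__main__":
    hist = simulate_game()
    print("start:", hist[0])
    print("end:  ", hist[-1])
```

Each player uses only his own measured cost $x_i$ and his own reference $g_i$; no player needs the form of the cost function or the other player's parameter. Under the stated assumptions both costs are expected to end below their initial values, with the parameters oscillating in a small band whose width shrinks with $\epsilon_i$.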


5  Conclusion 


The extremum seeking control approach with sliding mode proposed in [7][8][9][10] was applied to an n-person noncooperative dynamic game to calculate the Nash equilibrium solution. With the designed controller for each player in the game, the system enters a sliding boundary layer and stays there while the cost function decreases with oscillating behavior until the Nash equilibrium solution is reached. The simulation results show the effectiveness of the approach.

Figure 3: Nash Solution by Extremum Seeking Control ($\epsilon_1 = \epsilon_2 = 0.05$)

Figure 4: Nash Solution by Extremum Seeking Control ($\epsilon_1 = \epsilon_2 = 0.01$)

Figure 5: Nash Solution by Extremum Seeking Control ($\epsilon_1 = \epsilon_2 = 0.05$, $\omega_1 = \omega_2 = \ldots$)


